Growing network model for community with group structure 
We propose a growing network model for a community with a group structure. The community consists of
individual members and groups, gatherings of members. The community grows as a new member is introduced
by an existing member at each time step. The new member then creates a new group or joins one of the groups
of the introducer. We investigate the emerging community structure analytically and numerically. The group
size distribution shows a power law for a variety of growth rules, while the activity distribution
follows an exponential or a power law depending on the details of the growth rule. We also present an analysis
of empirical data from the online communities, the "Groups" in http://www.yahoo.com and the "Cafe" in
http://www.daum.net, which show a power law distribution for a wide range of group sizes.

PACS numbers: 89.75.Hc, 89.75.Fb, 05.65.+b 



I. INTRODUCTION 

Emergent properties of artificial and natural complex systems have attracted growing
interest recently. Some of them are conveniently modeled with a network, where the
constituent elements and their interactions are represented by vertices and links,
respectively. Watts and Strogatz demonstrated that real-world networks display the
small-world effect and the clustering property, which cannot be explained by regular
or random networks [1]. Later, in a study of the WWW network, Albert et al. found
that the degree of each vertex, i.e., the number of attached links, follows a power-law
distribution [2]. Those works triggered a burst of research on the structure and the
organizing principles of complex networks (see Refs. [3, 4, 5] for reviews).

Many real-world networks, e.g., in biological, social, and technological systems, are
found to obey a power-law degree distribution [3]. A network with a power-law degree
distribution is called a scale-free (SF) network. One possible mechanism for the power
law is provided by the Barabási-Albert (BA) model [6]. The model assumes that a network
is growing and that the rate at which an existing vertex acquires a new link is
proportional to its popularity, measured by its degree. Popularity-based growth appears
very natural: when creating a new web site, for example, one would link it preferentially
to popular sites having many links. With the BA and related network models, structural
and dynamical properties of networks have been explored extensively.

On the other hand, there exists another class of networks which have a group structure.
Consider, for example, online communities such as the "Groups" operated by Yahoo
(http://www.yahoo.com) and the "Cafes" operated by the Korean portal site Daum
(http://www.daum.net). They consist of individual members and groups, gatherings of
members with a common interest, and the growth of the community is driven not only by
members but also by groups. A community evolves as an individual registers as a new
member. Newcomers can create new groups with existing members or join existing groups.
The online community is a rapidly growing social network [7]. The emerging structure
would be distinct from that observed in networks without the group structure.
In this paper, we propose a growing network model for a community with a group
structure. We model the community with a bipartite network consisting of two distinct
kinds of vertices representing members and groups, respectively. A link may exist only
between a member vertex and a group vertex, and it represents a membership relation.

The bipartite network has been considered in the study of the movie actor network [1]
consisting of actors and movies, the scientific collaboration network [8, 9] of
scientists and articles, and the company director network of directors and boards of
directors. Usually those networks are treated as unipartite by projecting out the kind
of vertices of less interest. Some biological and social networks are known to have a
modular structure, where vertices in a common module are densely connected while
vertices in different modules are sparsely connected. The modular structure is coded
implicitly in the connectivity between vertices. Unipartite network models with the
modular structure have also been studied, where vertices form modules which in turn
form bigger modules hierarchically, or the modular structure emerges dynamically as a
result of social interactions [19]. In Ref. [19], each vertex is assigned a
Potts-spin-like variable pointing to its module. These studies on the group structure
of networks have mainly focused on groups with a finite number of members. However,
there are groups in real-world online communities which keep growing as the community
evolves.

Reflecting the growth dynamics of real-world online communities, our model takes
account of the group structure explicitly with a bipartite network consisting of member
and group vertices. As the network grows, both the member and the group vertices evolve
in time. We study the dynamics of the size of groups and the activity of members. The
size of a group is defined as the number of members in the group, and the activity of a
member is the number of groups in which the member participates. When the community
grows large enough, the group size distribution follows a power law, unlike in the
network models studied previously [14, 19]. To test our model, we analyze empirical
data from the online communities, the "Groups" in http://www.yahoo.com and the "Cafe"
in http://www.daum.net, and show that both communities indeed display power-law group
size distributions over wide ranges of group sizes.

This paper is organized as follows. In Sec. II we introduce the growing network model.
Depending on the choice of detailed dynamic rules, one may consider a few variants of
the model. Characteristics such as the group size distribution, the member activity
distribution, and the growth of the number of groups are studied analytically in a mean
field theory and numerically in Sec. III. Those characteristics are also calculated for
the real-world online communities and compared with the model results. We conclude the
paper with a summary in Sec. IV.

II. MODEL 

We introduce a model for a growing community with a group structure. The community
grows by adding one new member at a time, who may open a new group or join an existing
group [20]. The following notation is adopted: a member entering the community at time
step $i$ is denoted by $I_i$. The activity of $I_i$, i.e., the number of groups in
which it participates, is denoted by $A_i$. As members enter the community, new groups
are created or existing groups expand. The $\alpha$th group is denoted by $G_\alpha$,
its creation time by $\tau_\alpha$, and its size by $S_\alpha$. The total numbers of
members and groups are denoted by $N$ and $M$, respectively.

Initially, at time $t = 0$, the community is assumed to be inaugurated by $m_0$
members, denoted by $I_{-(m_0-1)}, \ldots, I_0$, belonging to an initial group $G_1$.
That is, we have $N(t=0) = m_0$, $M(t=0) = 1$, $A_j(t=0) = 1$ for
$j = -(m_0-1), \ldots, 0$, $\tau_1 = 0$, and $S_1(t=0) = m_0$. At time $t$, a new
individual $I_t$ is introduced into the community and becomes a member by repeating the
following procedures until its activity reaches $m$:

• Selection: It selects a partner $I_j$ among the existing members $\{I_{k<t}\}$ with
a selection probability $P^S_j$.

• Creation or Joining: With a creation probability $P^C_j$, it creates a new group
$G_{M+1}$ with the partner $I_j$. Otherwise, it selects one of the groups of $I_j$ at
random with equal probability and joins it. If $I_t$ is already a member of the
selected group, then the procedure is canceled.

The specific features of the model vary with the choice of the probabilities $P^S_j$
and $P^C_j$. Regarding the selection, the simplest choice is a random selection among
existing members with equal probability, $P^S_j = 1/(m_0 + t - 1)$. Note that the
selection may be regarded as an invitation of a new member by an existing member. Then
it may be natural to assume that active members invite more newcomers. Such a case is
modeled with a preferential selection probability $P^S_j = A_j / \sum_{k<t} A_k$. After
selecting a partner $I_j$, the newcomer may create a new group or join one of $I_j$'s
groups, with all $A_j + 1$ options equally probable. In that case the creation
probability is variable, $P^C_j = 1/(A_j + 1)$. Alternatively, it may create a new
group with a fixed probability $P^C_j = \omega$. Combining the strategies in the two
procedures, we consider four different growth models denoted by RV, RF, PV, and PF.
Here, R (P) stands for the random (preferential) selection, and V (F) for the group
creation with the variable (fixed) probability. For example, the RV model has the
selection probability $P^S_j = 1/(m_0 + t - 1)$ and the creation probability
$P^C_j = 1/(A_j + 1)$. The growth rules are summarized in Table I.

FIG. 1: A network for the RV model with $m_0 = m = 1$ and $N = 10$ with six groups.
A circled index $i$ and a boxed index $\alpha$ represent a member $I_i$ and a group
$G_\alpha$, respectively.

FIG. 2: (color online) A network for the RV model with $m_0 = m = 1$ and $N = 1000$.
A square (circle) symbol stands for a group (member).
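The growth rules of this section can be sketched as a short simulation. The following is a minimal illustration under stated assumptions, not the code used for the results reported here: the function name `grow_community`, its parameters, the zero-based group ids, and the `max_attempts` safeguard against endless canceled attempts are all hypothetical choices made for this sketch.

```python
import random

def grow_community(n_steps, m=1, m0=1, selection="R", creation="V",
                   omega=0.5, seed=0, max_attempts=10000):
    """Grow a community by n_steps newcomers.

    Returns (groups_of, members_of): groups_of maps a member index to the list
    of its group ids, members_of[g] is the set of members of group g.
    Group ids start at 0 here, so the initial group G_1 has id 0.
    """
    rng = random.Random(seed)
    # Initial condition: m0 members I_{-(m0-1)}, ..., I_0 share one group.
    members = list(range(-(m0 - 1), 1))
    groups_of = {j: [0] for j in members}
    members_of = [set(members)]
    for t in range(1, n_steps + 1):
        groups_of[t] = []
        attempts = 0
        # Repeat selection + creation/joining until the activity reaches m.
        while len(groups_of[t]) < m and attempts < max_attempts:
            attempts += 1
            if selection == "R":      # random selection
                j = rng.choice(members)
            else:                     # preferential selection, weight A_j
                weights = [len(groups_of[k]) for k in members]
                j = rng.choices(members, weights=weights)[0]
            a_j = len(groups_of[j])
            p_create = 1.0 / (a_j + 1) if creation == "V" else omega
            if rng.random() < p_create:
                # Create a new group containing the newcomer and the partner.
                gid = len(members_of)
                members_of.append({t, j})
                groups_of[t].append(gid)
                groups_of[j].append(gid)
            else:
                gid = rng.choice(groups_of[j])
                if t in members_of[gid]:
                    continue          # already a member: attempt is canceled
                members_of[gid].add(t)
                groups_of[t].append(gid)
        members.append(t)             # the newcomer can now be selected
    return groups_of, members_of


# RV model with m0 = m = 1 grown to N = 1000 members (999 newcomers).
groups_of, members_of = grow_community(999, m=1, m0=1, selection="R", creation="V")
print(len(groups_of), "members,", len(members_of), "groups")
```

The activity of a member is `len(groups_of[i])` and the size of a group is `len(members_of[g])`. The number of groups from a single run depends on the seed, so it should only be expected to be of the same order as the value quoted for the sample network of Fig. 2.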

TABLE I: Model description and mean field results for the group size distribution
exponent $\gamma$. Here, $\Theta_{RV}$ and $\Theta_{PV}$ are the group number growth
rates given in Eqs. (8) and (17), respectively. The activity distribution follows a
power law only for the PF model, with the exponent $\lambda = 2 + 1/\omega$.

The whole structure of the community is conveniently represented by a bipartite network
with two kinds of vertices, one for the groups and the other for the members. A link
exists only between a member vertex and a group vertex to which the member belongs. The
member activity and the group size correspond to the degree of the corresponding
vertex. Figure 1 shows a typical network configuration for the RV model with
$m_0 = m = 1$. To help readers understand the growth dynamics, we add the indices of
the members $I_i$ and the groups $G_\alpha$ in the figure. One can read off that $I_1$
selects $I_0$ and becomes a member of $G_1$ at $t = 1$, that $I_2$ opens a new group
$G_2$ with $I_0$ at $t = 2$, and so on. Figure 2 shows a configuration of an RV network
with $m = m_0 = 1$ grown up to $N = 1000$ members with $M = 452$ groups. It is
noteworthy that hub groups with a large number of members appear. The emerging
structure of the network will be studied in the next section.
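The correspondence between degrees and the activity and group size can be made concrete with the first two growth steps described above. This is only an illustrative sketch: the list `links` encodes the memberships stated in the text for $t \le 2$ ($I_0$ and $I_1$ in $G_1$, $I_0$ and $I_2$ in $G_2$), and the variable names are hypothetical.

```python
from collections import Counter

# Membership links (member index i, group index alpha) after t = 2 in Fig. 1:
# I_0 founds G_1, I_1 joins G_1, and I_2 opens G_2 together with I_0.
links = [(0, 1), (1, 1), (2, 2), (0, 2)]

# Degree of a member vertex = activity A_i; degree of a group vertex = size S_alpha.
activity = Counter(i for i, _ in links)
size = Counter(alpha for _, alpha in links)

print(dict(activity))  # {0: 2, 1: 1, 2: 1}
print(dict(size))      # {1: 2, 2: 2}
```

Since every link has one member end and one group end, the activities and the group sizes always add up to the same total number of links.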



III. NETWORK STRUCTURE 

The number of groups $M(t)$, the activity of each member $A_i(t)$, and the size of each
group $S_\alpha(t)$ increase as the network grows. With those quantities, we
characterize the growth dynamics and the network structure. In the following, we study
the dynamics of those quantities averaged over network realizations. For simplicity, we
use the same notation for the averaged quantities. The network dynamics implies that
they evolve in time as follows:

$$A_i(t+1) = A_i(t) + m P^S_i P^C_i, \tag{1}$$
$$M(t+1) = M(t) + m \sum_j P^S_j P^C_j, \tag{2}$$
$$S_\alpha(t+1) = S_\alpha(t) + m \sum_j P^S_j \chi_{j\alpha} (1 - P^C_j)/A_j, \tag{3}$$

where $\chi_{j\alpha} = 1$ if $I_j$ belongs to $G_\alpha$ and $0$ otherwise. The
initial conditions are given by $A_i(t=i) = m$, $M(t=0) = 1$, and
$S_\alpha(t=\tau_\alpha) = 2$, with $\tau_\alpha$ the creation time of $G_\alpha$. We
analyze the equations in a continuum limit and in a mean field scheme, neglecting any
correlation among dynamic variables.

First we consider the RV model. Using the corresponding $P^S_j$ and $P^C_j$ in Table I,
Eqs. (1), (2), and (3) become

$$\frac{dA_i}{dt} = \frac{m}{(A_i + 1)(m_0 + t)}, \tag{4}$$
$$\frac{dM}{dt} = \frac{1}{m_0 + t} \sum_{j<t} \frac{m}{A_j + 1}, \tag{5}$$
$$\frac{dS_\alpha}{dt} = \frac{S_\alpha}{m_0 + t} \left[ \frac{1}{m_0 + t} \sum_{j<t} \frac{m}{A_j + 1} \right], \tag{6}$$

where we approximate $\chi_{j\alpha}$ in Eq. (3) by $S_\alpha/(m_0 + t)$, the fraction
of members of $G_\alpha$ among all members. The solution for $A_i(t)$ is given by

$$A_i(t) = -1 + \sqrt{(m+1)^2 + 2m \ln \frac{m_0 + t}{m_0 + i}}. \tag{7}$$

It shows that an older member with smaller $i$ has a larger activity and that the
activity grows very slowly in time. With the solution for $A_i$, one can easily show
that $\sum_{j<t} m/(A_j + 1) \simeq \Theta_{RV} (m_0 + t)$ for large $t$ with

$$\Theta_{RV} = \int_0^1 du \, \frac{m}{\sqrt{(m+1)^2 - 2m \ln u}}. \tag{8}$$

Hence, the average number of groups increases linearly in time as
$M(t) \simeq \Theta_{RV} t$ with the group number growth rate $\Theta_{RV}$. The group
size increases algebraically as

$$S_\alpha(t) = 2 \left( \frac{m_0 + t}{m_0 + \tau_\alpha} \right)^{\Theta_{RV}}. \tag{9}$$
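The closed form for the activity in Eq. (7) can be checked against a direct numerical integration of the mean field equation $dA_i/dt = m/[(A_i+1)(m_0+t)]$ with $A_i(t=i) = m$. The sketch below uses a fourth-order Runge-Kutta integrator; the function names, the step count, and the parameter values in the example are illustrative assumptions.

```python
import math

def activity_rv(i, t, m=1, m0=1):
    """Closed-form mean field activity of Eq. (7)."""
    return -1.0 + math.sqrt((m + 1) ** 2 + 2.0 * m * math.log((m0 + t) / (m0 + i)))

def rhs(a, s, m, m0):
    """Right-hand side of dA/dt = m / ((A + 1)(m0 + t)) for the RV model."""
    return m / ((a + 1.0) * (m0 + s))

def integrate_activity_rv(i, t, m=1, m0=1, steps=10000):
    """Integrate the rate equation from A(i) = m up to time t with RK4."""
    h = (t - i) / steps
    a, s = float(m), float(i)
    for _ in range(steps):
        k1 = rhs(a, s, m, m0)
        k2 = rhs(a + 0.5 * h * k1, s + 0.5 * h, m, m0)
        k3 = rhs(a + 0.5 * h * k2, s + 0.5 * h, m, m0)
        k4 = rhs(a + h * k3, s + h, m, m0)
        a += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
        s += h
    return a

# Member entering at i = 10, observed at t = 1000, with m = m0 = 4.
print(activity_rv(10, 1000, m=4, m0=4))
print(integrate_activity_rv(10, 1000, m=4, m0=4))
```

The two printed values should agree to many digits, and the logarithm under the square root makes explicit how slowly the activity grows with $t$.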



We have obtained the activity of each member and the size of each group, which allow us
to derive the distribution functions $P_A(A)$ and $P_S(S)$ for the activity and the
group size, respectively. The activity distribution function is given by the relation
$P_A(A) = P_{in}(i) |di/dA|$ with the uniform distribution of the member index,
$P_{in}(i) = 1/(m_0 + t)$. The differentiation can be done through Eq. (7), which
yields that the activity distribution is bounded as
$P_A(A) = (A+1) \exp\{-[(A+1)^2 - (m+1)^2]/(2m)\}/m$. Similarly, the group size
distribution is given by $P_S(S) = P_\tau(\tau) |d\tau/dS|$ with the group creation
time distribution $P_\tau(\tau)$. We assume that the group creation time is distributed
uniformly, which is justified by the linear growth $M \simeq \Theta_{RV} (m_0 + t)$.
Then the group size distribution follows a power law $P_S(S) \sim S^{-\gamma_{RV}}$
with the exponent

$$\gamma_{RV} = 1 + \frac{1}{\Theta_{RV}}. \tag{10}$$

Note that the distribution exponent is determined by the group number growth rate
$\Theta_{RV}$.
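The growth rate $\Theta_{RV} = \int_0^1 du\, m/\sqrt{(m+1)^2 - 2m \ln u}$ has no elementary closed form, but it is easy to evaluate numerically. The sketch below substitutes $u = e^{-x}$, which turns the integral into $\int_0^\infty dx\, m\, e^{-x}/\sqrt{(m+1)^2 + 2mx}$, and applies a midpoint rule. For the exponent it uses $\gamma_{RV} = 1 + 1/\Theta_{RV}$, which follows from inverting $S_\alpha \propto (m_0+\tau_\alpha)^{-\Theta_{RV}}$ with uniformly distributed creation times. The function names, the cutoff `x_max`, and the number of grid points are assumptions of this sketch.

```python
import math

def theta_rv(m, x_max=50.0, n=200000):
    """Midpoint-rule estimate of Theta_RV after substituting u = exp(-x)."""
    h = x_max / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        total += m * math.exp(-x) / math.sqrt((m + 1) ** 2 + 2.0 * m * x)
    return total * h

def gamma_rv(m):
    """Group size exponent 1 + 1/Theta_RV for uniform group creation times."""
    return 1.0 + 1.0 / theta_rv(m)

for m in (1, 2, 4):
    print(m, round(theta_rv(m), 4), round(gamma_rv(m), 3))
```

For $m = 1$ the integral evaluates to roughly 0.42, which is of the same order as the ratio $M/N = 452/1000$ of the sample network in Fig. 2; a single realization with $m_0 = 1$ is not expected to match the mean field value exactly.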

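We now turn to the PF model. With the selection and creation probabilities of the PF
model, Eqs. (1), (2), and (3) are written as

$$\frac{dA_i}{dt} = \frac{m \omega A_i}{\sum_{j<t} A_j}, \tag{11}$$
$$\frac{dM}{dt} = m \omega, \tag{12}$$
$$\frac{dS_\alpha}{dt} = \frac{m (1 - \omega) S_\alpha}{\sum_{j<t} A_j}. \tag{13}$$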


Here we also used the approximation $\chi_{j\alpha} = S_\alpha/(m_0 + t)$ in Eq. (3).
Trivially we find that the group number grows in time as $M(t) = m \omega t + 1$. For
$A_i$ and $S_\alpha$, one needs to evaluate the quantity $\sum_{j<t} A_j$. Summing both
sides of Eq. (11) over all $i$, one obtains $\sum_{i<t} (dA_i/dt) = m \omega$. Note
that $d(\sum_{i<t} A_i)/dt \simeq \sum_{i<t} (dA_i/dt) + m = (1 + \omega) m$, where the
extra term $m$ is the activity brought in by each newcomer; this yields
$\sum_{j<t} A_j = m (1 + \omega) t + m_0$. Hence we obtain the algebraic growth of the
activity and the group size as

$$A_i(t) = m \left( \frac{m(1+\omega)t + m_0}{m(1+\omega)i + m_0} \right)^{\omega/(1+\omega)}, \tag{14}$$
$$S_\alpha(t) = 2 \left( \frac{m(1+\omega)t + m_0}{m(1+\omega)\tau_\alpha + m_0} \right)^{(1-\omega)/(1+\omega)}. \tag{15}$$

These results allow us to find the distribution functions $P_A(A)$ and $P_S(S)$. They
follow the power laws $P_A(A) \sim A^{-\lambda_{PF}}$ and $P_S(S) \sim S^{-\gamma_{PF}}$
with the exponents

$$\lambda_{PF} = 2 + 1/\omega \quad \text{and} \quad \gamma_{PF} = 2/(1 - \omega). \tag{16}$$

Here we also assumed the uniform distribution of $\tau_\alpha$ in Eq. (15), which is
supported by the linear growth $M(t) \simeq m \omega t$. In contrast to the RV model,
both distributions follow a power law. The exponents do not depend on the parameter
$m$, but only on the group creation probability $\omega$.
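The link between the growth laws and the distribution exponents of the PF model can be checked with a few lines of arithmetic. The sketch below assumes the growth exponents $\omega/(1+\omega)$ for the activity and $(1-\omega)/(1+\omega)$ for the group size, which follow from integrating the PF rate equations with $\sum_j A_j = m(1+\omega)t + m_0$, and uses the general rule that a quantity growing as a power $\beta$ of the ratio of current time to birth time, with uniformly distributed birth times, is distributed with exponent $1 + 1/\beta$. The function names are hypothetical.

```python
def pf_growth_exponents(omega):
    """Growth exponents of the activity and the group size in the PF model."""
    beta_activity = omega / (1.0 + omega)
    beta_size = (1.0 - omega) / (1.0 + omega)
    return beta_activity, beta_size

def pf_distribution_exponents(omega):
    """Distribution exponents (lambda_PF, gamma_PF) from the rule 1 + 1/beta."""
    beta_activity, beta_size = pf_growth_exponents(omega)
    return 1.0 + 1.0 / beta_activity, 1.0 + 1.0 / beta_size

for omega in (0.2, 0.5, 0.8):
    lam, gam = pf_distribution_exponents(omega)
    # Compare with the closed forms 2 + 1/omega and 2/(1 - omega).
    print(omega, lam, 2.0 + 1.0 / omega, gam, 2.0 / (1.0 - omega))
```

Each printed row should show the two routes to $\lambda_{PF}$ and to $\gamma_{PF}$ agreeing up to rounding, and none of them involves $m$.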

For the PV and the RF models, the following can be shown easily. The PV model behaves
similarly to the RV model. The group number increases linearly in time as
$M(t) \simeq \Theta_{PV} t$ with the group number growth rate $\Theta_{PV}$.
Unfortunately, we could not obtain a closed form expression for it. However, if we
adopt the assumption that the selection probability $P^S_i$ is proportional to
$A_i + 1$ instead of $A_i$, it can be evaluated analytically as

$$\Theta_{PV} \simeq \left( \sqrt{m^2 + 6m + 1} - (m + 1) \right)/2. \tag{17}$$

The approximation would become better for larger values of $m$. The group size grows
algebraically as in Eq. (9) with $\Theta_{PV}$ instead of $\Theta_{RV}$. Therefore, the
group size distribution follows a power law with the exponent $\gamma_{PV}$ presented
in Table I. The RF model also displays a power-law group size distribution. The
distribution exponent $\gamma_{RF}$ is given in Table I. Note that $\gamma_{RF}$ and
$\gamma_{PF}$ are the same. On the other hand, the activity distribution follows an
exponential distribution in the RF and the PV models.
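The approximate PV growth rate, $\Theta_{PV} \simeq (\sqrt{m^2+6m+1} - (m+1))/2$, is the positive root of $\Theta^2 + (m+1)\Theta - m = 0$, or equivalently of $\Theta = m/(m + 1 + \Theta)$. One way to read this condition, which is an interpretation offered here and is not spelled out in the text, is that with $P^S_j \propto A_j + 1$ and $P^C_j = 1/(A_j+1)$ every attempt creates a group with probability $N/\sum_k (A_k + 1)$, while $\sum_k A_k$ grows at the rate $m + \Theta$. The sketch below only verifies the algebra; the function name is hypothetical.

```python
import math

def theta_pv(m):
    """Approximate PV group number growth rate, positive root of the quadratic."""
    return (math.sqrt(m * m + 6 * m + 1) - (m + 1)) / 2.0

for m in (1, 2, 4, 8):
    theta = theta_pv(m)
    # Residual of the assumed self-consistency condition Theta = m / (m + 1 + Theta).
    residual = abs(theta - m / (m + 1 + theta))
    print(m, round(theta, 4), residual < 1e-12)
```

The rate stays below one for every $m$, as it must, since each newcomer makes $m$ attempts and not all of them create a group.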

The origin of the power-law distribution of the group size is easily understood. In all
models considered, the size of a group increases when one of its members invites a new
member. The larger a group is, the more chances it has to invite new members. Therefore
there exists preferential growth in the group size, which is known to lead to a
power-law distribution [6].

The activity of a member increases when a newcomer selects it and creates a new group.
When the random selection probability is adopted, such a process does not occur
preferentially for members with higher activity. This results in the exponential-type
activity distribution in the RV and RF models. In the PV model, although the selection
probability is proportional to the activity, the creation probability is nearly
inversely proportional to the activity. Hence, it does not have the preferential growth
mechanism in the member activity either. Only in the PF model is the activity growth
rate proportional to the activity of each member. Therefore, the activity distribution
follows a power law only in the PF model.
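The argument can be illustrated by evaluating the mean activity gain per time step, $m P^S_j P^C_j$, for each of the four models. In the sketch below the function name and the snapshot values `n_members` and `total_activity` are illustrative assumptions; only the ratio of the rates at two different activities matters.

```python
def activity_gain_rate(a, model, m=1, n_members=1000, total_activity=2000.0, omega=0.5):
    """Mean gain m * P^S * P^C per time step for a member of activity a.

    model is one of "RV", "RF", "PV", "PF"; n_members and total_activity
    are illustrative snapshot values, not taken from the paper.
    """
    p_select = 1.0 / n_members if model[0] == "R" else a / total_activity
    p_create = 1.0 / (a + 1.0) if model[1] == "V" else omega
    return m * p_select * p_create

# How much faster does a member with A = 20 gain activity than one with A = 10?
for model in ("RV", "RF", "PV", "PF"):
    ratio = activity_gain_rate(20, model) / activity_gain_rate(10, model)
    print(model, round(ratio, 3))
```

Doubling the activity doubles the gain only in the PF model; the ratio is below one for RV, equal to one for RF, and only slightly above one for PV, where it saturates as the activity grows.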
FIG. 3: (a) The group size distribution and (b) the activity distribution. The model
parameters are $m = 4$ and $m = 1$ for the RV and the PV model, respectively. The RF
model has $m = 4$ and $\omega = 0.6$, and the PF model has $m = 4$ and $\omega = 0.5$.
The community has grown up to N = 10* and the distributions are averaged over 10^
samples.

FIG. 4: (a) Numerical results for $\gamma$ for the RV and the PV model. The solid
(dashed) curve represents the analytic mean field results for the RV (PV) model.
(b) Numerical results for $\gamma$ (open symbols) of the RF and the PF model, and for
$\lambda$ (filled symbols) of the PF model. The solid (dashed) curve represents the
analytic results for $\gamma$ ($\lambda$) in Table I.



The analytic mean field results are compared with numerical simulations. In the
simulations, we chose $m_0 = m$, and all data were obtained by averaging over at least
10000 samples. We present the numerical data in Fig. 3. In accordance with the mean
field results, the group size distribution follows a power law in all cases. The
activity distribution also shows the expected behavior: a power-law distribution for
the PF model and exponential-type distributions for the other models. We summarize the
distribution exponents in Fig. 4. The measured values of the distribution exponents are
in good agreement with the analytic results.

Our network models behave differently from bipartite networks that have been studied
previously, such as the movie actor network, the scientific collaboration network, and
the director board network. For the first two examples, the growth is driven only by
the member vertices, the actors and the scientists, respectively. The activity of
members may increase in time. However, the group vertices, the movies and the papers,
respectively, are frozen dynamically and their sizes are bounded in practice. For the
last example, both the members (directors) and the groups (boards) may evolve in time.
However, it was shown that the group size distribution is also bounded [8].

FIG. 5: Cumulative group size distribution of the online communities in Yahoo and
Daum.

Our model is applicable to evolving networks with a group structure where the size of a
group may increase without limit. The online community is a good example of such
networks. To test this possibility, we study empirical data obtained from the Groups
and the Cafe operated by Yahoo at http://www.yahoo.com and Daum at http://www.daum.net,
respectively. In August 2004, there were 1,516,750 (1,743,130) groups (cafes) with
76,587,494 (351,565,837) cumulative members on the Yahoo (Daum) site. The numbers of
members of the groups are available via the web sites. Figure 5 presents the cumulative
distribution $P_>(S) = \sum_{S'>S} P_S(S')$ of the group size. The distribution has a
fat tail [22]. Although the distribution function on the log-log scale shows a
non-negligible curvature over the entire range, it can still be fitted reasonably well
by a power law for a range over two decades (see the straight lines drawn in Fig. 5).
From the fitting, we obtain the group size distribution exponents
$\gamma_{Yahoo} \simeq 2.8$ and $\gamma_{Daum} \simeq 2.15$. The power-law scaling
suggests that the online community may be described by our network model.
Unfortunately, information on the activity distribution is not publicly available, so
we could not compare the activity distribution of the communities with the model
results. We would like to add the following remark: a real-world online community
evolves in time as new members are introduced and new groups are created. At the same
time, it also evolves as members leave it and groups are closed. Those processes are
not incorporated into the model. Our model is a minimal model for the online community
in which the effects of leaving members and closed groups are neglected.
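The cumulative distribution and a straight-line fit on the log-log scale can be computed as sketched below. This is not the fitting procedure used for Fig. 5, whose details are not given here: the function names are hypothetical, the fit is a plain least-squares line through all supplied points, and the toy data are invented for illustration. If $P_S(S) \sim S^{-\gamma}$, the cumulative distribution decays as $S^{-(\gamma-1)}$, so the exponent is estimated as one minus the fitted slope.

```python
import math

def cumulative_distribution(sizes):
    """Return [(S, P_>(S))] with P_>(S) the fraction of groups larger than S."""
    ordered = sorted(sizes)
    n = len(ordered)
    points = []
    idx = 0
    for s in sorted(set(ordered)):
        while idx < n and ordered[idx] <= s:
            idx += 1                      # idx = number of groups with size <= s
        points.append((s, (n - idx) / n))
    return points

def loglog_slope(points):
    """Least-squares slope of log P versus log S, ignoring points with P = 0."""
    xs = [math.log(s) for s, p in points if p > 0]
    ys = [math.log(p) for s, p in points if p > 0]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

# Hypothetical toy data, not the Yahoo or Daum measurements.
toy_sizes = [1, 2, 2, 3, 5]
print(cumulative_distribution(toy_sizes))  # [(1, 0.8), (2, 0.4), (3, 0.2), (5, 0.0)]

# Points lying exactly on P_>(S) = 1/S give slope -1, i.e. gamma = 1 - slope = 2.
exact = [(1, 1.0), (10, 0.1), (100, 0.01)]
print(round(1.0 - loglog_slope(exact), 6))
```

In practice the fit would be restricted to the range of sizes where the curve is approximately straight, since the curvature noted above makes the estimated exponent depend on the chosen window.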

IV. SUMMARY 

We have introduced a bipartite network model for a growing community with a group
structure. The community consists of members and groups, gatherings of members. Those
ingredients are represented by distinct kinds of vertices, and a membership relation is
represented by a link between a member and a group. As the community grows, a group
increases its size when one of its members introduces a new member. Hence, a larger
group grows preferentially faster than a smaller group. With the analytic mean field
approach and computer simulations, we have shown that the preferential growth leads to
a power-law distribution of the group size. On the other hand, the activity
distribution follows a power law only for the PF model with the preferential selection
probability and the fixed creation probability (see Table I). We have also studied
empirical data obtained from the online communities, the Groups of Yahoo and the Cafe
of Daum. Both communities display a power-law distribution of the group size. This
suggests that our network model is useful in studying their structure.